Regular expression reference
In auto-redaction, you can supply a single word or a regular expression (regex) that Epiq Discovery matches and automatically redacts in the selected document images. A regex describes a pattern to match rather than a literal word. For example, the following regex finds email addresses from a specific company (name@epiqglobal.com).
Example: \w+@epiqglobal\.com
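As a rough illustration of how a pattern like this behaves, the sketch below tests the company email pattern with Python's `re` module. Python is only a stand-in here: the regex engine that Epiq Discovery uses may differ in details, and the sample text is invented. The sketch also compares an unescaped period, which matches any character, with the escaped form `\.`, which matches only a literal period.

```python
import re

# Hypothetical sample text; not taken from any real document.
text = "Contact jsmith@epiqglobal.com or jsmith@epiqglobal-com for details."

# Unescaped period: "." matches any single character, so "-" also matches.
loose = re.compile(r"\w+@epiqglobal.com")
# Escaped period: "\." matches only a literal period.
strict = re.compile(r"\w+@epiqglobal\.com")

loose_matches = loose.findall(text)
strict_matches = strict.findall(text)

print(loose_matches)
print(strict_matches)
```

In this sketch the loose pattern also picks up the hyphenated look-alike, while the strict pattern finds only the real address.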
To find an email address that begins with firstname.lastname, use the following example. It specifies this pattern: a first name of one or more word characters (letters, numbers, or underscores), a literal period (.), a last name of one or more word characters, a literal at symbol (@), one or more word characters, another literal period (.), and exactly three word characters at the end.
Example: \w+\.\w+@\w+\.\w{3}
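The following sketch shows how the firstname.lastname pattern could drive a text redaction. It assumes Python's `re` module as a stand-in for the product's regex engine, and the `redact` helper, the `[REDACTED]` mask, and the sample sentence are hypothetical names chosen for illustration.

```python
import re

# Pattern from the text: first.last@domain.xyz (three-character ending).
EMAIL = re.compile(r"\w+\.\w+@\w+\.\w{3}")


def redact(text: str, mask: str = "[REDACTED]") -> str:
    """Replace every match of EMAIL with a mask (hypothetical helper)."""
    return EMAIL.sub(mask, text)


# Invented sample: one address with a period in the name, one without.
sample = "Send to jane.doe@example.com, not to jdoe@example.com."
print(redact(sample))
```

Because the pattern requires a period before the at symbol, the address without a period in the name is left untouched.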
Use the regex elements in the following table to construct patterns for auto-redaction.
String | Matches text that contains | Example | Possible results |
---|---|---|---|
a literal word | the supplied single word. Use alone or combine with other elements. | Private (HR) department | Private HR department |
Metacharacter | Matches text that contains | Example | Possible results |
. | any single character, such as a letter, number, or symbol. | a.z<br>loc....n | a_z<br>location |
\d | a digit from 0 to 9. | \d<br>\d{5}(-\d{4})? | 7<br>66213 (ZIP codes) |
\D | a character that is not a digit. | \D\D\D<br>\D | AbC<br>% |
\w | a number, letter, or underscore. | invoice \w-\w\w\w \w\w\w\w | invoice A-5_1 D234 |
\W | a character that is not a letter, number, or underscore, such as a symbol. | \W<br>\W\W\W\W | $<br>*-+= |
\s | a white space character, like tab, space, or carriage return. | a\sb\sc | a b c |
\S | a character that is not a white space character (tab, space, or carriage return). | \S\S\S | you |
\ and a symbol | a literal regular expression reserved character, such as: \ . { } + ( ) * ? [ ] ^ $ |. Precede the reserved character with a backslash as the escape character. | a\.c<br>\.\*\? | a.c<br>.*? |
Quantifier | Matches text that contains | Example | Possible results |
* | the preceding character, 0 or more times. | a*b*c*<br>misspell* | aaacccc<br>misspel or misspelll |
+ | the preceding character, 1 or more times. | at+orney<br>\d+\.\d\d | attorney<br>10.00 (one or more digits, a period, and two decimal digits) |
? | the preceding character, either 0 or 1 times. | plurals?<br>honou?r | plurals or plural<br>honor or honour |
{n} | the preceding character or group exactly the specified number (n) of times. | a{3}<br>\d{5} | aaa<br>66213 |
{n,} | the preceding character or group the specified number (n) of times or more. | A{3,} | AAAAAA |
{n,m} | the preceding character or group the specified number (n) of times, but not more than the maximum (m) times. | \d{2,4} | 19, 198, or 1984 |
OR | Matches text that contains | Example | Possible results |
| | text on either side of the pipe symbol, which behaves like an OR operator. | 22|33<br>trade(off|in) | 22 or 33<br>tradeoff or tradein |
Group/Choice | Matches text that contains | Example | Possible results |
[ ... ] | any one character or number listed in the brackets. | [ABC]<br>[123] | A, B, or C<br>1, 2, or 3 |
[n - x] | any one character or number that falls within the supplied range. The brackets match only one character unless a quantifier follows. | [a-z]<br>\+[0-9]{11} | b<br>+14528281111 |
[^n] | any character or number other than those listed. | [^a]<br>[^a-y] | b<br>z |
( ) | all of the supplied characters or numbers, in the order given, treated as a group. | a(dmit)<br>..(465) | admit<br>br465 |
Boundary | Matches text | Example | Possible results |
^ | at the beginning of the extracted text when used outside of square brackets. | ^abc | abc (at the start) |
$ | at the end of the extracted text. | end$ | end (at the end) |
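
One way to sanity-check a pattern before using it for auto-redaction is to try it against sample text in a general-purpose regex engine. The sketch below runs several examples from the table through Python's `re` module. This is an assumption for illustration only: Epiq Discovery's engine may treat some elements differently, and the `CASES` list and `find_all` helper are hypothetical names.

```python
import re

# (pattern, sample text, expected matches), based on the table's examples.
CASES = [
    (r"loc....n", "location", ["location"]),
    (r"\d{5}", "ZIP 66213", ["66213"]),
    (r"at+orney", "attorney", ["attorney"]),
    (r"honou?r", "honor and honour", ["honor", "honour"]),
    (r"\d{2,4}", "19 198 1984", ["19", "198", "1984"]),
    (r"22|33", "22 or 33", ["22", "33"]),
    (r"trade(off|in)", "tradeoff tradein", ["tradeoff", "tradein"]),
    (r"\+[0-9]{11}", "call +14528281111", ["+14528281111"]),
    (r"^abc", "abc abc", ["abc"]),        # only the match at the start
    (r"end$", "end of the end", ["end"]),  # only the match at the end
]


def find_all(pattern: str, text: str) -> list[str]:
    """Return the full text of every non-overlapping match."""
    return [m.group(0) for m in re.finditer(pattern, text)]


for pattern, text, expected in CASES:
    found = find_all(pattern, text)
    status = "ok" if found == expected else "differs"
    print(f"{pattern!r:18} {found} {status}")
```

The helper uses `re.finditer` and `group(0)` rather than `re.findall` so that patterns containing parentheses, such as trade(off|in), still report the whole matched text instead of only the grouped part.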